Experience Replay is widely used in off-policy reinforcement learning. With cpprb, you can start your experiments quickly without implementing a replay buffer yourself.
The heavy calculation is implemented in C++ and Cython, so cpprb is usually faster than a naive pure-Python implementation.
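As a rough sketch of the workflow (the buffer size, field names, and shapes below are illustrative choices, not requirements):

```python
import numpy as np
from cpprb import ReplayBuffer

# Buffer size, field names, and shapes are assumptions for this sketch.
rb = ReplayBuffer(10000,
                  env_dict={"obs": {"shape": (4,)},
                            "act": {"shape": 1},
                            "rew": {},
                            "next_obs": {"shape": (4,)},
                            "done": {}})

# Store transitions one at a time; keyword names must match env_dict.
for _ in range(100):
    rb.add(obs=np.random.rand(4), act=0, rew=1.0,
           next_obs=np.random.rand(4), done=0)

# Sample a training batch as a dict of NumPy arrays keyed by the same names.
batch = rb.sample(32)
print(batch["obs"].shape)  # (32, 4)
```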
cpprb supports Ape-X style training on a single computer. You don't need to worry about locking yourself: cpprb internally acquires locks only around the critical sections.
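In an Ape-X setup the learner repeatedly samples prioritized transitions and writes updated priorities back. The following is a minimal single-process sketch of that cycle using PrioritizedReplayBuffer; the random TD-error and the field definitions are placeholders for illustration only:

```python
import numpy as np
from cpprb import PrioritizedReplayBuffer

# Field names, shapes, and size are assumptions for this sketch.
rb = PrioritizedReplayBuffer(10000,
                             env_dict={"obs": {"shape": (4,)},
                                       "act": {"shape": 1},
                                       "rew": {},
                                       "next_obs": {"shape": (4,)},
                                       "done": {}})

# Actors would normally fill the buffer; here we add dummy transitions.
for _ in range(100):
    rb.add(obs=np.random.rand(4), act=0, rew=0.0,
           next_obs=np.random.rand(4), done=0)

# The sampled batch also contains importance weights and the sampled indexes.
batch = rb.sample(32)
weights, indexes = batch["weights"], batch["indexes"]

# Placeholder for a real TD-error computed by the learner.
td_error = np.abs(np.random.rand(32)) + 1e-6
rb.update_priorities(indexes, td_error)  # feed new priorities back
```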
cpprb supports flexible environments: any number of NumPy-compatible environment values can be stored.
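Field names, shapes, and dtypes in env_dict are up to you. For example, an image observation and an extra per-step scalar could be declared as follows (these particular fields are only an assumption for illustration):

```python
import numpy as np
from cpprb import ReplayBuffer

# Arbitrary, user-defined fields: names, shapes, and dtypes are illustrative.
rb = ReplayBuffer(1000,
                  env_dict={"obs": {"shape": (84, 84, 3), "dtype": np.uint8},
                            "act": {"shape": 1, "dtype": np.int32},
                            "rew": {},
                            "log_prob": {},   # extra scalar stored alongside
                            "done": {}})

rb.add(obs=np.zeros((84, 84, 3), dtype=np.uint8),
       act=1, rew=0.5, log_prob=-0.7, done=0)
```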
You can build your own reinforcement learning algorithms together with your favorite deep learning library (e.g. TensorFlow, PyTorch).
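Since a sampled batch is just a dict of NumPy arrays, it can be handed to any framework. Below is a rough PyTorch-flavored sketch of one DQN-style update step; the networks, optimizer, hyperparameters, and dummy data are placeholders, and only the ReplayBuffer interaction comes from cpprb:

```python
import numpy as np
import torch
import torch.nn as nn
from cpprb import ReplayBuffer

# Placeholder buffer filled with dummy transitions for this sketch.
rb = ReplayBuffer(1000,
                  env_dict={"obs": {"shape": (4,)}, "act": {"shape": 1},
                            "rew": {}, "next_obs": {"shape": (4,)}, "done": {}})
for _ in range(100):
    rb.add(obs=np.random.rand(4), act=np.random.randint(2), rew=0.0,
           next_obs=np.random.rand(4), done=0)

# Placeholder networks and optimizer.
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Convert the sampled NumPy arrays to tensors for your framework of choice.
batch = rb.sample(32)
obs = torch.as_tensor(batch["obs"], dtype=torch.float32)
act = torch.as_tensor(batch["act"], dtype=torch.int64)
rew = torch.as_tensor(batch["rew"], dtype=torch.float32)
next_obs = torch.as_tensor(batch["next_obs"], dtype=torch.float32)
done = torch.as_tensor(batch["done"], dtype=torch.float32)

# One DQN-style update step as an example of "bring your own algorithm".
q = q_net(obs).gather(1, act)
with torch.no_grad():
    target = rew + 0.99 * (1.0 - done) * target_net(next_obs).max(1, keepdim=True)[0]
loss = nn.functional.mse_loss(q, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```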
TF2RL provides a set of reinforcement learning algorithms for TensorFlow 2 and uses cpprb for its off-policy algorithms.
You can find awesome repositories using cpprb. We look forward to seeing your great work show up, too.